Collaborator

@pggPL pggPL commented Jan 28, 2026

Description

Fixes a build issue introduced in #2502: that PR specified an incorrect minimum cuBLAS version, which causes build failures in some containers.

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Pawel Gadzinski <[email protected]>
Collaborator Author

pggPL commented Jan 28, 2026

/te-ci

Contributor

greptile-apps bot commented Jan 28, 2026

Greptile Overview

Greptile Summary

This PR corrects the minimum cuBLAS version requirement for grouped GEMM from 13.1.0 to 13.2.0, fixing build failures introduced in PR #2502.

Changes:

  • Updated compile-time version checks from CUBLAS_VERSION >= 130100 to CUBLAS_VERSION >= 130200
  • Updated runtime version checks from cublas_version() >= 130100 to cublas_version() >= 130200
  • Updated error messages and comments to reference cuBLAS 13.2+ instead of 13.1+
  • Fixed namespace qualification for cuda:: function calls to use transformer_engine::cuda::
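
The 130100 and 130200 thresholds follow cuBLAS's integer version encoding, `MAJOR * 10000 + MINOR * 100 + PATCH`, so 130200 corresponds to 13.2.0. A minimal sketch of that decoding and of the gate follows; the helper names (`decode_cublas_version`, `supports_grouped_gemm`) are hypothetical and not part of Transformer Engine, and only the 130200 threshold comes from this PR.

```cpp
// Hypothetical helpers for illustration; only the threshold value is from the PR.
// cuBLAS encodes its version as MAJOR * 10000 + MINOR * 100 + PATCH.
constexpr int kMinGroupedGemmCublasVersion = 130200;  // 13.2.0

struct CublasVersion {
  int major_ver;
  int minor_ver;
  int patch_ver;
};

// Splits an encoded version such as 130200 into its three components.
constexpr CublasVersion decode_cublas_version(int encoded) {
  return CublasVersion{encoded / 10000, (encoded / 100) % 100, encoded % 100};
}

// True when the encoded version meets the grouped GEMM requirement.
constexpr bool supports_grouped_gemm(int encoded) {
  return encoded >= kMinGroupedGemmCublasVersion;
}
```

Under this encoding a 13.1.x library (1301xx) stays below the threshold, which is the case the corrected check is meant to reject.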

Issue Found:

  • The error message at line 642 of cublaslt_grouped_gemm.cu still says "upgrade to CUDA 13.1 or newer" but should say "CUDA 13.2 or newer"

Confidence Score: 4/5

  • Safe to merge once the error message at line 642 is corrected
  • The PR correctly fixes the version check from 13.1.0 to 13.2.0 across all code paths, but contains one inconsistent error message that references CUDA 13.1 instead of 13.2
  • transformer_engine/common/gemm/cublaslt_grouped_gemm.cu line 642 needs correction

Important Files Changed

| Filename | Overview |
|---|---|
| transformer_engine/common/gemm/cublaslt_grouped_gemm.cu | Updated version checks from 13.1.0 to 13.2.0, but error message at line 642 still references CUDA 13.1 |
| transformer_engine/common/include/transformer_engine/gemm.h | Updated documentation to reference cuBLAS 13.2+, but still mentions CUDA 13.1+ which may be inconsistent |

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant API as nvte_grouped_gemm
    participant Check as Version Check
    participant cuBLAS as cuBLAS 13.2+
    
    User->>API: Call nvte_grouped_gemm()
    API->>Check: Verify CUBLAS_VERSION >= 130200
    alt Compile-time check fails
        Check-->>User: Error: Requires cuBLAS 13.2+
    else Compile-time check passes
        API->>Check: Verify runtime cublas_version() >= 130200
        alt Runtime check fails
            Check-->>User: Error: Runtime cuBLAS too old
        else Runtime check passes
            API->>Check: Verify SM arch >= 100 (Blackwell)
            alt Architecture check fails
                Check-->>User: Error: Requires Blackwell or newer
            else All checks pass
                API->>cuBLAS: Execute grouped GEMM
                cuBLAS-->>API: Result
                API-->>User: Success
            end
        end
    end
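
The gating order drawn in the diagram can be sketched as a single function. This is an illustration under stated assumptions rather than the actual Transformer Engine code: `DEMO_CUBLAS_VERSION` stands in for the `CUBLAS_VERSION` macro from the cuBLAS headers, `GateResult` and `check_grouped_gemm_support` are hypothetical names, and the runtime values are passed in as plain integers instead of being queried from the library and device.

```cpp
// Stand-in for the CUBLAS_VERSION macro (assumption for this sketch).
#ifndef DEMO_CUBLAS_VERSION
#define DEMO_CUBLAS_VERSION 130200
#endif

// Hypothetical result type: which gate, if any, rejected the call.
enum class GateResult { kOk, kCompileTimeTooOld, kRuntimeTooOld, kArchTooOld };

// Applies the checks in the order shown in the diagram:
// compile-time cuBLAS version, then runtime cuBLAS version, then SM architecture.
constexpr GateResult check_grouped_gemm_support(int runtime_cublas_version, int sm_arch) {
#if DEMO_CUBLAS_VERSION >= 130200
  return runtime_cublas_version < 130200 ? GateResult::kRuntimeTooOld
         : sm_arch < 100                 ? GateResult::kArchTooOld
                                         : GateResult::kOk;
#else
  // Built against an older cuBLAS: the feature is compiled out entirely.
  return GateResult::kCompileTimeTooOld;
#endif
}
```

The compile-time branch is needed in addition to the runtime one because a binary built against older headers cannot reference the newer API at all, while a binary built against newer headers may still be loaded next to an older runtime library.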

Contributor

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

Collaborator Author

pggPL commented Jan 28, 2026

/te-ci jax L0

@pggPL pggPL added the 2.12.0 label Jan 28, 2026
Signed-off-by: Pawel Gadzinski <[email protected]>
Collaborator Author

pggPL commented Jan 29, 2026

/te-ci pytorch

Contributor

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

  NVTE_CHECK(transformer_engine::cuda::sm_arch(current_device) >= 100,
             "nvte_grouped_gemm requires Blackwell (SM100) or newer architecture.");
- NVTE_CHECK(cuda::cublas_version() >= 130200,
+ NVTE_CHECK(transformer_engine::cuda::cublas_version() >= 130200,
Collaborator

Shouldn't there be more than one place where this transformer_engine:: qualifier needs to be added?
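
The PR does not say why the unqualified `cuda::` calls needed the extra qualifier. One plausible explanation, offered here only as an assumption, is that another namespace named `cuda` is visible at the call site, so unqualified lookup from outside `transformer_engine` resolves to it instead of `transformer_engine::cuda`. The sketch below shows that lookup behaviour with stand-in namespaces and a hypothetical `which()` function; it does not reproduce the real headers.

```cpp
// Stand-in for an unrelated global namespace that happens to be called `cuda`
// (hypothetical; the PR does not name the namespace involved).
namespace cuda {
constexpr int which() { return 1; }
}  // namespace cuda

namespace transformer_engine {
namespace cuda {
constexpr int which() { return 2; }
}  // namespace cuda

// Inside transformer_engine, `cuda::` finds the nested namespace first.
constexpr int from_inside() { return cuda::which(); }
}  // namespace transformer_engine

// At global scope, `cuda::` names the global namespace instead.
constexpr int from_global() { return cuda::which(); }

// Full qualification selects the intended namespace regardless of scope.
constexpr int from_global_qualified() { return transformer_engine::cuda::which(); }
```

If this is what happened, every call made from outside the `transformer_engine` namespace would need the same qualification, which is the point of the question above.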

Collaborator Author

pggPL commented Jan 29, 2026

/te-ci

Collaborator Author

pggPL commented Jan 29, 2026

/te-ci

Signed-off-by: Pawel Gadzinski <[email protected]>
Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

Signed-off-by: Pawel Gadzinski <[email protected]>
Contributor

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 3 comments

Signed-off-by: Pawel Gadzinski <[email protected]>
Contributor

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 2 comments

Signed-off-by: Pawel Gadzinski <[email protected]>
Contributor

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 1 comment

  cudaStream_t stream) {
- NVTE_ERROR("nvte_grouped_gemm requires cuBLAS 13.1+, but compile-time cuBLAS version is ",
+ NVTE_ERROR("nvte_grouped_gemm requires cuBLAS 13.2+, but compile-time cuBLAS version is ",
             CUBLAS_VERSION, ". Please upgrade to CUDA 13.1 or newer.");
Contributor

The error message says "upgrade to CUDA 13.1 or newer" but should say "CUDA 13.2 or newer" to match the cuBLAS 13.2+ requirement.

Suggested change
- CUBLAS_VERSION, ". Please upgrade to CUDA 13.1 or newer.");
+ CUBLAS_VERSION, ". Please upgrade to CUDA 13.2 or newer.");
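
One way to keep such messages from drifting away from the check is to derive the version text from the same constant the check uses. The following is a sketch of that idea only, with hypothetical names (`kMinCublasVersion`, `major_minor`, `grouped_gemm_requirement_message`); it is not the code in this PR, and it deliberately says nothing about which CUDA toolkit release ships a given cuBLAS version.

```cpp
#include <string>

// Single source of truth for the requirement (value taken from the PR).
constexpr int kMinCublasVersion = 130200;

// Formats an encoded cuBLAS version (MAJOR * 10000 + MINOR * 100 + PATCH)
// as "MAJOR.MINOR".
inline std::string major_minor(int encoded) {
  return std::to_string(encoded / 10000) + "." + std::to_string((encoded / 100) % 100);
}

// Hypothetical message builder: the required version in the text is always
// derived from kMinCublasVersion, so it cannot disagree with the check.
inline std::string grouped_gemm_requirement_message(int compiled_version) {
  return "nvte_grouped_gemm requires cuBLAS " + major_minor(kMinCublasVersion) +
         "+, but compile-time cuBLAS version is " + std::to_string(compiled_version) + ".";
}
```

With this shape, bumping the threshold changes both the comparison and the wording in one place.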

@pggPL pggPL merged commit c3769cb into NVIDIA:main Jan 30, 2026
10 of 12 checks passed

3 participants